1.
Article in English | MEDLINE | ID: mdl-38648121

ABSTRACT

The selective pressure of pathogen-host symbiosis drives adaptations. How these interactions shape the metabolism of pathogens is largely unknown. Here, we use comparative genomics to systematically analyse the metabolic networks of oomycetes, a diverse group of eukaryotes that includes saprotrophs as well as animal and plant pathogens, the latter causing devastating diseases with significant economic and/or ecological impact. In our analyses of 44 oomycete species, we uncover considerable variation in metabolism that can be linked to lifestyle differences. Comparisons of metabolic gene content reveal that plant pathogenic oomycetes have a bipartite metabolism consisting of a conserved core and an accessory set. The accessory set can be associated with the degradation of defence compounds produced by plants when challenged by pathogens. Obligate biotrophic oomycetes have smaller metabolic networks, and taxonomically distantly related biotrophic lineages display convergent evolution by repeated gene losses in both the conserved and the accessory set of metabolism. When investigating to what extent the metabolic networks in obligate biotrophs differ from those in hemibiotrophic plant pathogens, we observe that the losses of metabolic enzymes in obligate biotrophs are not random and that gene losses predominantly influence the terminal branches of the metabolic networks. Our analyses represent the first metabolism-focused comparison of oomycetes at this scale and will contribute to a better understanding of the evolution of oomycete metabolism in relation to lifestyle adaptation.
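
The split into a conserved core and an accessory set described in this abstract can be illustrated with plain set operations. The sketch below is not the authors' pipeline: the species names and enzyme identifiers are made-up toy data, and defining "core" strictly as "present in every species" is an assumption made here for illustration.

```python
def split_core_accessory(gene_content):
    """Split metabolic gene content into a core set (present in every
    species) and an accessory set (present in some but not all species).

    gene_content maps a species name to the set of enzyme identifiers
    annotated in that species.
    """
    gene_sets = list(gene_content.values())
    core = set.intersection(*gene_sets)
    accessory = set.union(*gene_sets) - core
    return core, accessory


# Hypothetical toy data, not taken from the study.
gene_content = {
    "species_A": {"EC:1.1.1.1", "EC:2.7.1.1", "EC:3.2.1.4"},
    "species_B": {"EC:1.1.1.1", "EC:2.7.1.1"},
    "species_C": {"EC:1.1.1.1", "EC:2.7.1.1", "EC:4.1.1.1"},
}

core, accessory = split_core_accessory(gene_content)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
```

In practice a softer threshold (for example, presence in most rather than all species) may be preferable, since annotation gaps would otherwise shrink the core.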

3.
Nat Nanotechnol ; 2024 Feb 13.
Article in English | MEDLINE | ID: mdl-38351230

ABSTRACT

Proteins are the primary functional actors of the cell. While proteoform diversity is known to be highly biologically relevant, current protein analysis methods are of limited use for distinguishing proteoforms. Mass spectrometric methods, in particular, often provide only ambiguous information on post-translational modification sites, and sequences of co-existing modifications may not be resolved. Here we demonstrate fluorescence resonance energy transfer (FRET)-based single-molecule protein fingerprinting to map the location of individual amino acids and post-translational modifications within single full-length protein molecules. Our data show that both intrinsically disordered proteins and folded globular proteins can be fingerprinted with a subnanometer resolution, achieved by probing the amino acids one by one using single-molecule FRET via DNA exchange. This capability was demonstrated through the analysis of alpha-synuclein, an intrinsically disordered protein, by accurately quantifying isoforms in mixtures using a machine learning classifier, and by determining the locations of two O-GlcNAc moieties. Furthermore, we demonstrate fingerprinting of the globular proteins Bcl-2-like protein 1, procalcitonin and S100A9. We anticipate that our ability to perform proteoform identification with the ultimate sensitivity may unlock exciting new venues in proteomics research and biomarker-based diagnosis.

4.
Plant J ; 117(4): 1281-1297, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37965720

ABSTRACT

Phytoplasmas are pathogenic bacteria that reprogram plant host development for their own benefit. Previous studies have characterized a few different phytoplasma effector proteins that destabilize specific plant transcription factors. However, these are only a small fraction of the potential effectors used by phytoplasmas; therefore, the molecular mechanisms through which phytoplasmas modulate their hosts require further investigation. To obtain further insights into the phytoplasma infection mechanisms, we generated a protein-protein interaction network between a broad set of phytoplasma effectors and a large, unbiased collection of Arabidopsis thaliana transcription factors and transcriptional regulators. We found widespread, but specific, interactions between phytoplasma effectors and host transcription factors, especially those related to host developmental processes. In particular, many unrelated effectors target specific sets of TCP transcription factors, which regulate plant development and immunity. Comparison with other host-pathogen protein interaction networks shows that phytoplasma effectors have unusual targets, indicating that phytoplasmas have evolved a unique and unusual infection strategy. This study contributes a rich and solid data source that guides further investigations of the functions of individual effectors, as demonstrated for some herein. Moreover, the dataset provides insights into the underlying molecular mechanisms of phytoplasma infection.


Subjects
Arabidopsis , Phytoplasma , Transcription Factors/genetics , Transcription Factors/metabolism , Plants/metabolism , Arabidopsis/metabolism , Protein Interaction Mapping , Plant Diseases/microbiology
5.
J Mass Spectrom ; 58(6): e4951, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37259491

ABSTRACT

In this work, we introduce the application of proton transfer reaction mass spectrometry (PTR-MS) for the selection of improved terpene synthase mutants. In comparison with gas chromatography mass spectrometry (GC-MS)-based methods, PTR-MS could offer advantages by reduction of sample preparation steps and analysis time. The method we propose here allows for minimal sample preparation and analysis time and provides a promising platform for the high throughput screening (HTS) of large enzyme mutant libraries. To investigate the feasibility of a PTR-MS-based screening method, we employed a small library of Callitropsis nootkatensis valencene synthase (CnVS) mutants. Bacterial cultures expressing enzyme mutants were subjected to different growth formats, and headspace terpene concentrations measured by PTR-Qi-ToF-MS were compared with GC-MS, to rank the activity of the enzyme mutants. For all cultivation formats, including 96 deep well plates, PTR-Qi-ToF-MS resulted in the same ranking of the enzyme variants, compared with the canonical format using 100 mL flasks and GC-MS analysis. This study provides a first basis for the application of rapid PTR-Qi-ToF-MS detection, in combination with multi-well formats, in HTS methods for the selection of highly productive terpene synthases.


Subjects
Protons , Volatile Organic Compounds , High-Throughput Screening Assays , Mass Spectrometry/methods , Terpenes , Volatile Organic Compounds/analysis
6.
Bioinform Adv ; 3(1): vbad017, 2023.
Article in English | MEDLINE | ID: mdl-36818730

ABSTRACT

Summary: With its candybar form factor and low initial investment cost, the MinION brought affordable portable nucleic acid analysis within reach. However, translating the electrical signal it outputs into a sequence of bases still requires mid-tier computer hardware, which remains a caveat when aiming for deployment of many devices at once or usage in remote areas. For applications focusing on detection of a target sequence, such as infectious disease monitoring or species identification, the computational cost of analysis may be reduced by directly detecting the target sequence in the electrical signal instead. Here, we present baseLess, a computational tool that enables such target-detection-only analysis. BaseLess makes use of an array of small neural networks, each of which efficiently detects a fixed-size subsequence of the target sequence directly from the electrical signal. We show that baseLess can accurately determine the identity of reads between three closely related fish species and can classify sequences in mixtures of 20 bacterial species, on an inexpensive single-board computer. Availability and implementation: baseLess and all code used in data preparation and validation are available on GitHub at https://github.com/cvdelannoy/baseLess, under an MIT license. The validation data and scripts used can be found at https://doi.org/10.4121/20261392, under an MIT license. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
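
The idea of detecting "fixed-size subsequences of the target sequence" can be made concrete by listing those subsequences (k-mers) for a given target. The sketch below covers only that enumeration step; it does not reproduce baseLess itself, which works on the raw electrical signal with neural networks. The function name, the example sequence, and the choice of k are all assumptions for illustration, and the abstract does not state which subsequence length or selection strategy the tool uses.

```python
def target_kmers(target, k):
    """Return the distinct length-k subsequences (k-mers) of a target
    sequence, in order of first appearance."""
    if k <= 0 or k > len(target):
        raise ValueError("k must be between 1 and len(target)")
    kmers = (target[i:i + k] for i in range(len(target) - k + 1))
    # dict.fromkeys keeps insertion order while dropping repeats.
    return list(dict.fromkeys(kmers))


# Hypothetical target sequence; each k-mer would get its own detector.
target = "ACGTACGT"
kmers = target_kmers(target, k=4)
print(kmers)
```

A read would then be scored by how many of these k-mers its signal appears to contain, which avoids full basecalling.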

7.
Comput Struct Biotechnol J ; 21: 630-643, 2023.
Article in English | MEDLINE | ID: mdl-36659927

ABSTRACT

Recent breakthroughs in protein structure prediction demarcate the start of a new era in structural bioinformatics. Combined with various advances in experimental structure determination and the uninterrupted pace at which new structures are published, this promises an age in which protein structure information is as prevalent and ubiquitous as sequence. Machine learning in protein bioinformatics has been dominated by sequence-based methods, but this is now changing to make use of the deluge of rich structural information as input. Machine learning methods making use of structures are scattered across literature and cover a number of different applications and scopes; while some try to address questions and tasks within a single protein family, others aim to capture characteristics across all available proteins. In this review, we look at the variety of structure-based machine learning approaches, how structures can be used as input, and typical applications of these approaches in protein biology. We also discuss current challenges and opportunities in this all-important and increasingly popular field.

8.
Nucleic Acids Res ; 51(5): 2363-2376, 2023 03 21.
Article in English | MEDLINE | ID: mdl-36718935

ABSTRACT

It has been known for decades that codon usage contributes to translation efficiency and hence to protein production levels. However, its role in protein synthesis is still only partly understood. This lack of understanding hampers the design of synthetic genes for efficient protein production. In this study, we generated a synonymous codon-randomized library of the complete coding sequence of red fluorescent protein. Protein production levels and the full coding sequences were determined for 1459 gene variants in Escherichia coli. Using different machine learning approaches, these data were used to reveal correlations between codon usage and protein production. Interestingly, protein production levels can be relatively accurately predicted (Pearson correlation of 0.762) by a Random Forest model that only relies on the sequence information of the first eight codons. In this region, close to the translation initiation site, mRNA secondary structure rather than Codon Adaptation Index (CAI) is the key determinant of protein production. This study clearly demonstrates the key role of codons at the start of the coding sequence. Furthermore, these results imply that commonly used CAI-based codon optimization of the full coding sequence is not a very effective strategy. One should rather focus on optimizing protein production by reducing mRNA secondary structure formation involving the first few codons.


Subjects
Escherichia coli , Machine Learning , Random Allocation , Codon/genetics , Codon/metabolism , Messenger RNA/metabolism , Escherichia coli/genetics , Escherichia coli/metabolism , Protein Biosynthesis
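
The abstract of entry 8 reports that a model using only the first eight codons predicts protein production fairly well. A minimal sketch of extracting that feature window is given below. It is not the authors' code: the function name and the example sequence are hypothetical, and counting the eight codons from the very start of the supplied sequence (including the start codon) is an assumption, since the abstract does not say how the window is defined.

```python
def first_codons(cds, n=8):
    """Split a coding sequence into codons and return the first n.

    Assumes cds begins in frame at the first codon of interest.
    """
    cds = cds.upper()
    if len(cds) < 3 * n:
        raise ValueError("sequence is shorter than the requested window")
    return [cds[i:i + 3] for i in range(0, 3 * n, 3)]


# Hypothetical 30-nt sequence, not a variant from the study.
example_cds = "ATGGCTTCTTCTGAAGATGTTATCAAAGAA"
window = first_codons(example_cds)

# Position-labelled categorical features, e.g. as input to a tree model.
features = {f"codon_{i + 1}": codon for i, codon in enumerate(window)}
print(features)
```

Categorical features of this kind would still need encoding (for example one-hot) before being passed to most Random Forest implementations.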
9.
Plant J ; 112(5): 1298-1315, 2022 12.
Article in English | MEDLINE | ID: mdl-36239071

ABSTRACT

Photosynthesis is a key process in sustaining plant and human life. Improving the photosynthetic capacity of agricultural crops is an attractive means to increase their yields. While the core mechanisms of photosynthesis are highly conserved in C3 plants, these mechanisms are very flexible, allowing considerable diversity in photosynthetic properties. Among this diversity is the maintenance of high photosynthetic light-use efficiency at high irradiance as identified in a small number of exceptional C3 species. Hirschfeldia incana, a member of the Brassicaceae family, is such an exceptional species, and because it is easy to grow, it is an excellent model for studying the genetic and physiological basis of this trait. Here, we present a reference genome of H. incana and confirm its high photosynthetic light-use efficiency. While H. incana has the highest photosynthetic rates found so far in the Brassicaceae, the light-saturated assimilation rates of closely related Brassica rapa and Brassica nigra are also high. The H. incana genome has extensively diversified from that of B. rapa and B. nigra through large chromosomal rearrangements, species-specific transposon activity, and differential retention of duplicated genes. Duplicated genes in H. incana, B. rapa, and B. nigra that are involved in photosynthesis and/or photoprotection show a positive correlation between copy number and gene expression, providing leads into the mechanisms underlying the high photosynthetic efficiency of these species. Our work demonstrates that the H. incana genome serves as a valuable resource for studying the evolution of high photosynthetic light-use efficiency and enhancing photosynthetic rates in crop species.


Subjects
Brassica rapa , Brassicaceae , Humans , Brassicaceae/metabolism , Photosynthesis/genetics , Agricultural Crops , Phenotype
10.
G3 (Bethesda) ; 12(11)2022 11 04.
Article in English | MEDLINE | ID: mdl-36149290

ABSTRACT

Expression quantitative trait locus mapping has been widely used to study the genetic regulation of gene expression in Arabidopsis thaliana. As a result, a large amount of expression quantitative trait locus data has been generated for this model plant; however, only a few causal expression quantitative trait locus genes have been identified, and experimental validation is costly and laborious. A prioritization method could help speed up the identification of causal expression quantitative trait locus genes. This study extends the machine-learning-based QTG-Finder2 method for prioritizing candidate causal genes in phenotype quantitative trait loci to be used for expression quantitative trait loci by adding gene structure, protein interaction, and gene expression. Independent validation shows that the new algorithm can prioritize 16 out of 25 potential expression quantitative trait locus causal genes within the top 20% rank. Several new features are important in prioritizing causal expression quantitative trait locus genes, including the number of protein-protein interactions, unique domains, and introns. Overall, this study provides a foundation for developing computational methods to prioritize candidate expression quantitative trait locus causal genes. The prediction of all genes is available in the AraQTL workbench (https://www.bioinformatics.nl/AraQTL/) to support the identification of gene expression regulators in Arabidopsis.


Subjects
Arabidopsis , Arabidopsis/genetics , Quantitative Trait Loci , Chromosome Mapping , Phenotype , Algorithms
11.
Nat Commun ; 13(1): 5402, 2022 09 14.
Article in English | MEDLINE | ID: mdl-36104339

ABSTRACT

Single-molecule FRET (smFRET) is a versatile technique to study the dynamics and function of biomolecules since it makes nanoscale movements detectable as fluorescence signals. The powerful ability to infer quantitative kinetic information from smFRET data is, however, complicated by experimental limitations. Diverse analysis tools have been developed to overcome these hurdles but a systematic comparison is lacking. Here, we report the results of a blind benchmark study assessing eleven analysis tools used to infer kinetic rate constants from smFRET trajectories. We test them against simulated and experimental data containing the most prominent difficulties encountered in analyzing smFRET experiments: different noise levels, varied model complexity, non-equilibrium dynamics, and kinetic heterogeneity. Our results highlight the current strengths and limitations in inferring kinetic information from smFRET trajectories. In addition, we formulate concrete recommendations and identify key targets for future developments, aimed to advance our understanding of biomolecular dynamics through quantitative experiment-derived models.


Subjects
Benchmarking , Fluorescence Resonance Energy Transfer , Fluorescence Resonance Energy Transfer/methods , Kinetics , Theoretical Models
12.
Bioinformatics ; 38(18): 4403-4405, 2022 09 15.
Article in English | MEDLINE | ID: mdl-35861394

ABSTRACT

SUMMARY: The ever-increasing number of sequenced genomes necessitates the development of pangenomic approaches for comparative genomics. Introduced in 2016, PanTools is a platform that allows pangenome construction, homology grouping and pangenomic read mapping. The use of graph database technology makes PanTools versatile, applicable from small viral genomes like SARS-CoV-2 up to large plant or animal genomes like tomato or human. Here, we present our third major update to PanTools that enables the integration of functional annotations and provides both gene-level analyses and phylogenetics. AVAILABILITY AND IMPLEMENTATION: PanTools is implemented in Java 8 and released under the GNU GPLv3 license. Software and documentation are available at https://git.wur.nl/bioinformatics/pantools. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
COVID-19 , SARS-CoV-2 , Humans , Phylogeny , SARS-CoV-2/genetics , Software , Viral Genome
14.
Mol Biol Evol ; 39(1)2022 01 07.
Article in English | MEDLINE | ID: mdl-34597400

ABSTRACT

Meiotic recombination is a biological process of key importance in breeding, to generate genetic diversity and develop novel or agronomically relevant haplotypes. In crop tomato, recombination is curtailed as manifested by linkage disequilibrium decay over a longer distance and reduced diversity compared with wild relatives. Here, we compared domesticated and wild populations of tomato and found an overall conserved recombination landscape, with local changes in effective recombination rate in specific genomic regions. We also studied the dynamics of recombination hotspots resulting from domestication and found that loss of such hotspots is associated with selective sweeps, most notably in the pericentromeric heterochromatin. We detected footprints of genetic changes and structural variants, some of them associated with transposable elements, linked with hotspot divergence during domestication, likely causing fine-scale alterations to recombination patterns and resulting in linkage drag.


Subjects
Domestication , Solanum lycopersicum , DNA Transposable Elements/genetics , Solanum lycopersicum/genetics , Plant Breeding , Genetic Recombination
15.
Haematologica ; 107(1): 143-153, 2022 01 01.
Article in English | MEDLINE | ID: mdl-33596640

ABSTRACT

T-cell prolymphocytic leukemia (T-PLL) is mostly characterized by aberrant expansion of small- to medium-sized prolymphocytes with a mature post-thymic phenotype, high aggressiveness of the disease and poor prognosis. However, T-PLL is more heterogeneous with a wide range of clinical, morphological, and molecular features, which occasionally impedes the diagnosis. We hypothesized that T-PLL consists of phenotypic and/or genotypic subgroups that may explain the heterogeneity of the disease. Multi-dimensional immuno-phenotyping and gene expression profiling did not reveal clear T-PLL subgroups, and no clear T-cell receptor α or β CDR3 skewing was observed between different T-PLL cases. We revealed that the expression of microRNA (miRNA) is aberrant and often heterogeneous in T-PLL. We identified 35 miRNA that were aberrantly expressed in T-PLL with miR-200c/141 as the most differentially expressed cluster. High miR-200c/141 and miR-181a/181b expression was significantly correlated with increased white blood cell counts and poor survival. Furthermore, we found that overexpression of miR-200c/141 correlated with downregulation of their targets ZEB2 and TGFβR3 and aberrant TGFβ1-induced phosphorylated SMAD2 (p-SMAD2) and p-SMAD3, indicating that the TGFβ pathway is affected in T-PLL. Our results thus highlight the potential role for aberrantly expressed oncogenic miRNA in T-PLL and pave the way for new therapeutic targets in this disease.


Subjects
T-Cell Prolymphocytic Leukemia , MicroRNAs , Gene Expression Profiling , Humans , T-Cell Prolymphocytic Leukemia/diagnosis , T-Cell Prolymphocytic Leukemia/genetics , T-Cell Prolymphocytic Leukemia/therapy , Lymphocytes , MicroRNAs/genetics , Transforming Growth Factor beta , Zinc Finger E-box Binding Homeobox 2/genetics
16.
F1000Res ; 11: 802, 2022.
Article in English | MEDLINE | ID: mdl-37035464

ABSTRACT

Background: Many studies have demonstrated the utility of machine learning (ML) methods for genomic prediction (GP) of various plant traits, but a clear rationale for choosing ML over conventionally used, often simpler parametric methods, is still lacking. Predictive performance of GP models might depend on a plethora of factors including sample size, number of markers, population structure and genetic architecture. Methods: Here, we investigate which problem and dataset characteristics are related to good performance of ML methods for genomic prediction. We compare the predictive performance of two frequently used ensemble ML methods (Random Forest and Extreme Gradient Boosting) with parametric methods including genomic best linear unbiased prediction (GBLUP), reproducing kernel Hilbert space regression (RKHS), BayesA and BayesB. To explore problem characteristics, we use simulated and real plant traits under different genetic complexity levels determined by the number of Quantitative Trait Loci (QTLs), heritability ($h^2$ and $h^2_e$), population structure and linkage disequilibrium between causal nucleotides and other SNPs. Results: Decision tree based ensemble ML methods are a better choice for nonlinear phenotypes and are comparable to Bayesian methods for linear phenotypes in the case of large effect Quantitative Trait Nucleotides (QTNs). Furthermore, we find that ML methods are susceptible to confounding due to population structure but less sensitive to low linkage disequilibrium than linear parametric methods. Conclusions: Overall, this provides insights into the role of ML in GP as well as guidelines for practitioners.


Subjects
Genomics , Plant Breeding , Bayes Theorem , Genomics/methods , Quantitative Trait Loci/genetics , Machine Learning , Plants/genetics
17.
iScience ; 24(11): 103239, 2021 Nov 19.
Article in English | MEDLINE | ID: mdl-34729466

ABSTRACT

Single-molecule protein identification is an unrealized concept with potentially ground-breaking applications in biological research. We propose a method called FRET X (Förster Resonance Energy Transfer via DNA eXchange) fingerprinting, in which the FRET efficiency is read out between exchangeable dyes on protein-bound DNA docking strands and accumulated FRET efficiencies constitute the fingerprint for a protein. To evaluate the feasibility of this approach, we simulated fingerprints for hundreds of proteins using a coarse-grained lattice model and experimentally demonstrated FRET X fingerprinting on model peptides. Measured fingerprints are in agreement with our simulations, corroborating the validity of our modeling approach. In a simulated complex mixture of >300 human proteins of which only cysteines, lysines, and arginines were labeled, a support vector machine was able to identify constituents with 95% accuracy. We anticipate that our FRET X fingerprinting approach will form the basis of an analysis tool for targeted proteomics.
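
The "FRET efficiency" that makes up a fingerprint follows, in the standard Förster picture, the relation E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 the Förster radius. The sketch below applies that textbook relation to a list of distances; it is not the authors' simulation or analysis code. The default R0 of 5.0 nm and the example distances are illustrative assumptions and are not values from the study.

```python
def fret_efficiency(distance_nm, r0_nm=5.0):
    """Standard Förster relation: E = 1 / (1 + (r / R0)^6).

    r0_nm is the Förster radius of the dye pair; 5.0 nm is an
    illustrative default, not a value taken from the paper.
    """
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


def fingerprint(distances_nm, r0_nm=5.0):
    """Collect one FRET efficiency per labelled residue into a list."""
    return [fret_efficiency(d, r0_nm) for d in distances_nm]


# Hypothetical distances (nm) from a reference dye to three labelled residues.
example = fingerprint([3.0, 5.0, 7.0])
print([round(e, 3) for e in example])
```

Because the efficiency falls off with the sixth power of distance, it is most sensitive near r = R0, which is why the choice of dye pair matters for resolving residues at different positions.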

18.
iScience ; 24(10): 103202, 2021 Oct 22.
Article in English | MEDLINE | ID: mdl-34703997

ABSTRACT

The identification of proteins at the single-molecule level would open exciting new venues in biological research and disease diagnostics. Previously, we proposed a nanopore-based method for protein identification called chop-n-drop fingerprinting, in which the fragmentation pattern induced and measured by a proteasome-nanopore construct is used to identify single proteins. In the simulation study presented here, we show that 97.1% of human proteome constituents are uniquely identified under close to ideal measuring circumstances, using a simple alignment-based classification method. We show that our method is robust against experimental error, as 69.4% can still be identified if the resolution is twice as low as currently attainable, and 10% of proteasome restriction sites and protein fragments are randomly ignored. Based on these results and our experimental proof of concept, we argue that chop-n-drop fingerprinting has the potential to make cost-effective single-molecule protein identification feasible in the near future.

19.
Front Microbiol ; 12: 748178, 2021.
Article in English | MEDLINE | ID: mdl-34707596

ABSTRACT

Metabolism is the set of biochemical reactions of an organism that enables it to assimilate nutrients from its environment and to generate building blocks for growth and proliferation. It forms a complex network that is intertwined with the many molecular and cellular processes that take place within cells. Systems biology aims to capture the complexity of cells, organisms, or communities by reconstructing models based on information gathered by high-throughput analyses (omics data) and prior knowledge. One type of model is a genome-scale metabolic model (GEM) that allows studying the distributions of metabolic fluxes, i.e., the "mass-flow" through the network of biochemical reactions. GEMs are nowadays widely applied and have been reconstructed for various microbial pathogens, either in a free-living state or in interaction with their hosts, with the aim to gain insight into mechanisms of pathogenicity. In this review, we first introduce the principles of systems biology and GEMs. We then describe how metabolic modeling can contribute to unraveling microbial pathogenesis and host-pathogen interactions, with a specific focus on oomycete plant pathogens and in particular Phytophthora infestans. Subsequently, we review achievements obtained so far and identify and discuss potential pitfalls of current models. Finally, we propose a workflow for reconstructing high-quality GEMs and elaborate on the resources needed to advance a systems biology approach aimed at untangling the intimate interactions between plants and pathogens.

20.
PLoS One ; 16(9): e0253102, 2021.
Article in English | MEDLINE | ID: mdl-34591846

ABSTRACT

In genomics, optical mapping technology provides long-range contiguity information to improve genome sequence assemblies and detect structural variation. Originally a laborious manual process, Bionano Genomics platforms now offer high-throughput, automated optical mapping based on chips packed with nanochannels through which unwound DNA is guided and the fluorescent DNA backbone and specific restriction sites are recorded. Although the raw image data obtained is of high quality, the processing and assembly software accompanying the platforms is closed source and does not seem to make full use of data, labeling approximately half of the measured signals as unusable. Here we introduce two new software tools, independent of Bionano Genomics software, to extract and process molecules from raw images (OptiScan) and to perform molecule-to-molecule and molecule-to-reference alignments using a novel signal-based approach (OptiMap). We demonstrate that the molecules detected by OptiScan can yield better assemblies, and that the approach taken by OptiMap results in higher use of molecules from the raw data. These tools lay the foundation for a suite of open-source methods to process and analyze high-throughput optical mapping data. The Python implementations of the OptiTools are publicly available through http://www.bif.wur.nl/.


Subjects
Genomics/methods , Optical Restriction Mapping/methods , Chromosome Mapping , High-Throughput Nucleotide Sequencing , DNA Sequence Analysis